Addressing surprisal deficiencies in reading time models

نویسندگان

  • Marten Van Schijndel
  • William Schuler
چکیده

This study demonstrates a weakness in how n-gram and PCFG surprisal are used to predict reading times in eye-tracking data. In particular, the information conveyed by words skipped during saccades is not usually included in the surprisal measures. This study shows that correcting the surprisal calculation improves n-gram surprisal and that upcoming n-grams affect reading times, replicating previous findings of how lexical frequencies affect reading times. In contrast, the predictivity of PCFG surprisal does not benefit from the surprisal correction despite the fact that lexical sequences skipped by saccades are processed by readers, as demonstrated by the corrected n-gram measure. These results raise questions about the formulation of information-theoretic measures of syntactic processing such as PCFG surprisal and entropy reduction when applied to reading times.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Lexical surprisal as a general predictor of reading time

Probabilistic accounts of language processing can be psychologically tested by comparing word-reading times (RT) to the conditional word probabilities estimated by language models. Using surprisal as a linking function, a significant correlation between unlexicalized surprisal and RT has been reported (e.g., Demberg and Keller, 2008), but success using lexicalized models has been limited. In th...

متن کامل

Early effects of word surprisal on pupil size during reading

This study investigated the relation between word surprisal and pupil dilation during reading. Participants’ eye movements and pupil size were recorded while they read single sentences. Surprisal values for each word in the sentence stimuli were estimated by both a recurrent neural network and a phrasestructure grammar. Higher surprisal corresponded to longer word-reading time, and this effect ...

متن کامل

Word surprisal predicts N400 amplitude during reading

We investigated the effect of word surprisal on the EEG signal during sentence reading. On each word of 205 experimental sentences, surprisal was estimated by three types of language model: Markov models, probabilistic phrasestructure grammars, and recurrent neural networks. Four event-related potential components were extracted from the EEG of 24 readers of the same sentences. Surprisal estima...

متن کامل

Surprisal-based comparison between a symbolic and a connectionist model of sentence processing

The ‘unlexicalized surprisal’ of a word in sentence context is defined as the negative logarithm of the probability of the word’s part-of-speech given the sequence of previous partsof-speech of the sentence. Unlexicalized surprisal is known to correlate with word reading time. Here, it is shown that this correlation grows stronger when surprisal values are estimated by a more accurate language ...

متن کامل

Predictive power of word surprisal for reading times is a linear function of language model quality

Within human sentence processing, it is known that there are large effects of a word’s probability in context on how long it takes to read it. This relationship has been quantified using informationtheoretic surprisal, or the amount of new information conveyed by a word. Here, we compare surprisals derived from a collection of language models derived from n-grams, neural networks, and a combina...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016